Search results for "Nearest-neighbor chain algorithm"

showing 5 items of 5 documents

Parallelized Clustering of Protein Structures on CUDA-Enabled GPUs

2014

Estimation of the pose in which two given molecules might bind together to form a potential complex is a crucial task in structural biology. To solve this so-called "docking problem", most algorithms initially generate large numbers of candidate poses (or decoys) which are then clustered to allow for subsequent computationally expensive evaluations of reasonable representatives. Since the number of such candidates ranges from thousands to millions, performing the clustering on standard CPUs is highly time consuming. In this paper we analyze and evaluate different approaches to parallelize the nearest neighbor chain algorithm to perform hierarchical Ward clustering of protein structures usin…

CUDASpeedupComputer scienceNearest-neighbor chain algorithmParallel computingCluster analysisRoot-mean-square deviationPoseWard's methodHierarchical clustering2014 22nd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing
researchProduct

A Greedy Algorithm for Hierarchical Complete Linkage Clustering

2014

We are interested in the greedy method to compute an hierarchical complete linkage clustering. There are two known methods for this problem, one having a running time of \({\mathcal O}(n^3)\) with a space requirement of \({\mathcal O}(n)\) and one having a running time of \({\mathcal O}(n^2 \log n)\) with a space requirement of Θ(n 2), where n is the number of points to be clustered. Both methods are not capable to handle large point sets. In this paper, we give an algorithm with a space requirement of \({\mathcal O}(n)\) which is able to cluster one million points in a day on current commodity hardware.

CombinatoricsCURE data clustering algorithmSUBCLUNearest-neighbor chain algorithmCorrelation clusteringSingle-linkage clusteringHierarchical clustering of networksGreedy algorithmComplete-linkage clusteringMathematics
researchProduct

Improving Nearest Neighbor Based Multi-target Prediction Through Metric Learning

2017

The purpose of this work is to learn specific distance functions to be applied for multi-target regression problems using nearest neighbors. The idea of preserving the order relation between input and output vectors considering their corresponding distances is used along a maximal margin criterion to formulate a specific metric learning problem. Extensive experiments and the corresponding discussion try to put forward the advantages of the proposed algorithm that can be considered as a generalization of previously proposed approaches. Preliminary results suggest that this line of work can lead to very competitive algorithms with convenient properties.

Cover treeComputer scienceNearest neighbor search0211 other engineering and technologies02 engineering and technologyk-nearest neighbors algorithmBest bin firstMargin (machine learning)Nearest-neighbor chain algorithmMetric (mathematics)0202 electrical engineering electronic engineering information engineering020201 artificial intelligence & image processingAlgorithmLarge margin nearest neighbor021101 geological & geomatics engineering
researchProduct

An efficient prototype merging strategy for the condensed 1-NN rule through class-conditional hierarchical clustering

2002

Abstract A generalized prototype-based classification scheme founded on hierarchical clustering is proposed. The basic idea is to obtain a condensed 1-NN classification rule by merging the two same-class nearest clusters, provided that the set of cluster representatives correctly classifies all the original points. Apart from the quality of the obtained sets and its flexibility which comes from the fact that different intercluster measures and criteria can be used, the proposed scheme includes a very efficient four-stage procedure which conveniently exploits geometric cluster properties to decide about each possible merge. Empirical results demonstrate the merits of the proposed algorithm t…

Single-linkage clusteringcomputer.software_genreComplete-linkage clusteringHierarchical clusteringk-nearest neighbors algorithmArtificial IntelligenceNearest-neighbor chain algorithmClassification ruleSignal ProcessingCluster (physics)Computer Vision and Pattern RecognitionData miningMerge (version control)computerSoftwareMathematicsPattern Recognition
researchProduct

SparseHC: A Memory-efficient Online Hierarchical Clustering Algorithm

2014

Computing a hierarchical clustering of objects from a pairwise distance matrix is an important algorithmic kernel in computational science. Since the storage of this matrix requires quadratic space with respect to the number of objects, the design of memory-efficient approaches is of high importance to this research area. In this paper, we address this problem by presenting a memory-efficient online hierarchical clustering algorithm called SparseHC. SparseHC scans a sorted and possibly sparse distance matrix chunk-by-chunk. Meanwhile, a dendrogram is built by merging cluster pairs as and when the distance between them is determined to be the smallest among all remaining cluster pairs. The k…

sparse matrixClustering high-dimensional dataTheoretical computer scienceonline algorithmsComputer scienceSingle-linkage clusteringComplete-linkage clusteringNearest-neighbor chain algorithmConsensus clusteringmemory-efficient clusteringCluster analysisk-medians clusteringGeneral Environmental ScienceSparse matrix:Engineering::Computer science and engineering [DRNTU]k-medoidsDendrogramConstrained clusteringHierarchical clusteringDistance matrixCanopy clustering algorithmGeneral Earth and Planetary SciencesFLAME clusteringHierarchical clustering of networkshierarchical clusteringAlgorithmProcedia Computer Science
researchProduct